Extraction of Contextual Relevance of Web Documents

Authors

Nidhi Tyagi

Rahul Rishi

Abstract

Crawled web pages should be organized so that they are more understandable to machines, which in turn produces results that are meaningful and relevant. A set of web pages can be categorized by contextual sense if the crawler has a technique for understanding their meaning and identifying their domain. The contextual relevance of web documents can be determined by identifying the frequently occurring patterns of keywords in each page. This can be achieved with a data mining technique for generating frequent patterns, FP-Growth. The mined patterns help deduce the set of keywords of each document, and this knowledge is added to a knowledge store, which facilitates building an ontology for the crawled web pages, organizing them, and thus increasing the rank of the document.
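The frequent-pattern step described in the abstract can be illustrated with a minimal, pure-Python sketch of FP-Growth. This is not the authors' implementation: it assumes each crawled page has already been reduced to a list of keywords, and the names `FPNode`, `build_tree`, `fp_growth`, `frequent_keyword_sets`, and the sample keywords are hypothetical. The minimum support is an absolute count of pages.

```python
from collections import defaultdict


class FPNode:
    """One node of the FP-tree: an item, its count, and a link to its parent."""

    def __init__(self, item, parent):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}


def build_tree(transactions, min_support):
    """Build an FP-tree from (items, count) pairs.

    Returns the header table (item -> list of tree nodes) and the
    support of every frequent item.
    """
    freq = defaultdict(int)
    for items, count in transactions:
        for item in set(items):
            freq[item] += count
    freq = {i: c for i, c in freq.items() if c >= min_support}

    root = FPNode(None, None)
    header = defaultdict(list)
    for items, count in transactions:
        # Keep frequent items only, most frequent first (ties broken by name).
        ordered = sorted((i for i in set(items) if i in freq),
                         key=lambda i: (-freq[i], i))
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += count
            node = child
    return header, freq


def fp_growth(transactions, min_support, suffix=frozenset()):
    """Yield (itemset, support) for every frequent itemset."""
    header, freq = build_tree(transactions, min_support)
    for item, nodes in header.items():
        itemset = suffix | {item}
        yield itemset, freq[item]
        # Conditional pattern base: the prefix path of each node holding `item`.
        conditional = []
        for node in nodes:
            path = []
            parent = node.parent
            while parent is not None and parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                conditional.append((path, node.count))
        yield from fp_growth(conditional, min_support, itemset)


def frequent_keyword_sets(pages, min_support):
    """pages: one keyword list per crawled page (assumed pre-extracted)."""
    transactions = [(keywords, 1) for keywords in pages]
    return dict(fp_growth(transactions, min_support))


# Hypothetical keyword lists standing in for four crawled pages.
pages = [
    ["ontology", "crawler", "web"],
    ["crawler", "web", "ranking"],
    ["ontology", "web"],
    ["crawler", "web"],
]
for itemset, support in frequent_keyword_sets(pages, min_support=2).items():
    print(sorted(itemset), support)
```

Keyword sets that recur across pages, such as a pair of terms that appears together in most documents of a domain, are the kind of pattern the abstract proposes adding to the knowledge store. How those patterns are then turned into an ontology is not specified in the abstract and is not covered by this sketch.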


Similar resources

Relevant term suggestion in interactive web search based on contextual information in query session logs

This paper proposes an effective term suggestion approach to interactive Web search. Conventional approaches to making term suggestions involve extracting co-occurring keyterms from highly ranked retrieved documents. Such approaches must deal with term extraction difficulties and interference from irrelevant documents, and, more importantly, have difficulty extracting terms that are conceptuall...


MIKE: An Interactive Microblogging Keyword Extractor using Contextual Semantic Smoothing

Social media, such as tweets on Twitter and Short Message Service (SMS) messages on cellular networks, are short-length textual documents (short texts or microblog posts) exchanged among users on the Web and/or their mobile devices. Automatic keyword extraction from short texts can be applied in online applications such as tag recommendation and contextual advertising. In this paper we present ...


A Client Side Tool for Contextual Web

This thesis describes the design and development of an application that uses information relevant to the context of a web search for the purpose of improving the search results obtained using standard search engines. The representation of the contextual information is based on a Vector Space Model and is obtained from a set of documents that have been identified as relevant to the context of th...


Contextual Query Expansion for Acquiring Web Documents

Query expansion is an information retrieval technique in which new query terms are added to the original query terms to improve search performance. Contextual query expansion is a major issue in today's era. In this paper, contextualization is achieved by performing document extraction and term extraction activities on the particular domain information source. User query is expanded using docume...


Contextual Concept Discovery Algorithm

In this paper, we focus on the ontological concept extraction and evaluation process from HTML documents. In order to improve this process, we propose an unsupervised hierarchical clustering algorithm namely “Contextual Concept Discovery” (CCD) which is an incremental use of the partitioning algorithm Kmeans and is guided by a structural context. Our context exploits the html structure and the ...




Publication date: 2013
